ABSTRACT
Nowadays, numerous techniques and ICT tools exist for the detection of COVID-19. However, these techniques rely on fully developed or peak symptoms. There is a pressing need for early detection of COVID-19 from self-reported symptoms, or even in the absence of symptoms, to ease further diagnosis and treatment. This paper proposes a novel approach for the early detection of COVID-19 based on spectral analysis of cough sounds using the discrete wavelet transform (DWT), followed by classification with a deep convolutional neural network (DCNN). The proposed method combines cough spectral analysis with a deep-learning-based algorithm and returns the probability of COVID-19 infection. Empirical results show that the proposed method of COVID-19 detection using DWT-based cough spectral analysis and deep learning achieves better accuracy than conventional methods. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
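The abstract does not give implementation details, but the DWT feature-extraction stage it describes can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes a single-level Haar wavelet (the simplest DWT basis) applied recursively, and summarizes each sub-band by its energy as a hypothetical feature vector that a classifier such as a DCNN could consume. The signal and function names are invented for the example.

```python
import numpy as np

def haar_dwt(signal):
    """One-level Haar DWT: returns approximation and detail coefficients."""
    x = np.asarray(signal, dtype=float)
    if len(x) % 2:                      # pad to even length
        x = np.append(x, 0.0)
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2)  # low-pass branch (coarse trend)
    detail = (even - odd) / np.sqrt(2)  # high-pass branch (fine structure)
    return approx, detail

def dwt_features(signal, levels=3):
    """Multi-level decomposition; summarize each sub-band by its energy."""
    feats = []
    a = np.asarray(signal, dtype=float)
    for _ in range(levels):
        a, d = haar_dwt(a)
        feats.append(np.sum(d ** 2))    # detail-band energy at this level
    feats.append(np.sum(a ** 2))        # energy of the final approximation
    return np.array(feats)

# Example: a synthetic decaying 1 kHz burst, loosely cough-like, at 8 kHz
t = np.linspace(0, 0.1, 800, endpoint=False)
sig = np.sin(2 * np.pi * 1000 * t) * np.exp(-30 * t)
print(dwt_features(sig))  # one energy value per sub-band
```

Because the Haar transform is orthonormal, the sub-band energies sum to the total signal energy, so no information is lost in the decomposition itself; only the energy summarization is lossy. A production system would more likely use a library such as PyWavelets with a higher-order wavelet family.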
ABSTRACT
During the COVID-19 pandemic, it became standard procedure for people all around the world to wear Respiratory Protection Masks (RPMs) that cover both the nose and the mouth. The consequences of wearing RPMs, particularly those pertaining to the perception and production of spoken communication, are becoming increasingly prominent. Face masks also attenuate voice signals, and this alteration affects speech-processing technologies such as Automatic Speaker Verification (ASV) and speech-to-text conversion. A deep-learning-based intervention is considered vital to remedy the resulting degradation of speaker-based technologies. Therefore, in the proposed framework, a speaker identification system has been implemented to examine the effect of masks. First, speech signals were captured, pre-processed, and augmented with a variety of data augmentation techniques. Then, Mel-Frequency Cepstral Coefficient (MFCC) features were extracted and fed into a Long Short-Term Memory (LSTM) network to identify speakers. The system's overall performance was assessed using accuracy, precision, recall, and F1-score, yielding 93%, 93.3%, 92.2%, and 92.8%, respectively. These results are still preliminary and will be refined in future work through data expansion and the use of multiple optimization techniques. © 2022 IEEE.
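The MFCC extraction stage mentioned in the abstract follows a standard pipeline: windowed power spectrum, mel-scale triangular filterbank, log compression, and a DCT-II to decorrelate the coefficients. The sketch below is a minimal numpy illustration of that pipeline for a single frame, not the paper's implementation; all parameter values (26 filters, 13 cepstra, 25 ms frames) are common defaults assumed for the example, and libraries such as librosa or torchaudio provide production versions.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters with centers evenly spaced on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):           # rising edge of the triangle
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):           # falling edge of the triangle
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def mfcc(frame, sr, n_filters=26, n_ceps=13):
    """MFCCs for one frame: power spectrum -> mel energies -> log -> DCT-II."""
    n_fft = len(frame)
    spec = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
    mel_e = mel_filterbank(n_filters, n_fft, sr) @ spec
    log_e = np.log(mel_e + 1e-10)       # floor avoids log(0)
    n = np.arange(n_filters)
    # DCT-II matrix decorrelates the log filterbank energies
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_filters))
    return dct @ log_e

# Example: MFCCs of one 25 ms frame (400 samples) of a 440 Hz tone at 16 kHz
sr = 16000
t = np.arange(400) / sr
coeffs = mfcc(np.sin(2 * np.pi * 440 * t), sr)
print(coeffs.shape)  # (13,)
```

In the system the abstract describes, a sequence of such per-frame vectors would form the time series fed to the LSTM, whose recurrence models how the speaker's spectral envelope evolves across frames.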